-
Notifications
You must be signed in to change notification settings - Fork 62
Add more options to df_engine docker build, slim down features for some packages #1540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more options to df_engine docker build, slim down features for some packages #1540
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1540 +/- ##
==========================================
+ Coverage 83.44% 83.52% +0.08%
==========================================
Files 428 428
Lines 119043 119720 +677
==========================================
+ Hits 99338 100000 +662
- Misses 19171 19186 +15
Partials 534 534
🚀 New features to boost your workflow:
|
| azure_identity = { workspace = true, optional = true } | ||
| flate2 = { workspace = true, optional = true } | ||
| reqwest = { workspace = true, optional = true } | ||
| reqwest = { workspace = true, optional = true, default-features = false, features = ["rustls-tls"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of comments here -
- No need to repeat
[default-features = false]in the crate if the workspace already sets it. - It’s better to let the crate manage its own reqwest features instead of forcing them at the workspace level.
- Pick one of
rustls-tlsorrustls-tls-native-roots. They both pull in their own CA bundles, so using both at once doesn’t help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review!
No need to repeat [default-features = false] in the crate if the workspace already sets it.
Good to know! Removed it in the otap crate.
It’s better to let the crate manage its own reqwest features instead of forcing them at the workspace level.
Makes perfect sense, removed at the workspace level.
Pick one of rustls-tls or rustls-tls-native-roots. They both pull in their own CA bundles, so using both at once doesn’t help.
Good catch, I meant to pick rustls-tls not rustls-tls-native-roots and forgot to update both places. Removed the workspace level feature anyways now, so we just have rustls-tls in the otap crate.
|
One thing to keep in mind: the PR switches the TLS backend from |
Yeah absolutely, my goal here is just to try to get things working in the context of the existing docker build. We actually have a couple of other challenges on the Microsoft side related to crypto with this project, because I started trying to get arrow-rs-object-store to allow for pluggable crypto a few months ago rather than just swap from |
This PR addresses a few problems that I ran into getting
df_enginerunning in my kubernetes environment. The end result of all this is that you can now run a docker build like this:docker build --build-arg FEATURES="azure" --build-arg RUSTFLAGS="-C target-feature=-gfni" --build-arg DEBUG=1 --build-context otel-arrow=../../ -f Dockerfile -t df_engine .TLS
Some of the Azure SDK crates + reqwest have a lot of default features enabled that we are not using and which rely on linking with openssl for TLS. This was giving me trouble with our musl build.
I've disabled default features for these packages at the workspace level and opted for using
rustls-tlswith reqwest in theotapcrate.Configuring the docker build
FEATURESbuild arg to the dockerfile and the cross platform build script so that you can turn on features likeazurewhen buildingRUSTFLAGSfor the docker build so that we can turn some features on or off when compiling which brings me to my next item...Adding some optional debug utils to the image
When running on k8s I was getting very mysterious crashes after a couple of seconds with 0 logs to stdout. Remoting into the container and running the binary gave me the "Illegal instruction (core dumped)" message, which was a start.
After checking the obvious vector extensions like SSE/AVX/AVX512, I was stumped. So, I added gdb to the container and found the exact instruction which is the longest mnemonic I have ever seen (comes from arrow's buffer crate):
Turns out my laptop has support for a specific feature called
gfni, which the nodes with Intel Xeon processors that I was running did not.I think this will not be the last time someone needs to debug a native code/cross-compilation problem, so I added an option to the build called
DEBUGthat will includegdb(and maybe other debug utilities in the future) in the image.